Search CORE

20 research outputs found

パターン照合問題に対する高速なアルゴリズム

Author: Diptarama Hendrian
Publication venue
Publication date: 27/03/2018
Field of study

Tohoku University篠原歩課

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Online Algorithms for Constructing Linear-Size Suffix Trie

Author: Hendrian Diptarama
Inenaga Shunsuke
Takagi Takuya
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Publication date: 01/01/2019
Field of study

The suffix trees are fundamental data structures for various kinds of string processing. The suffix tree of a string T of length n has O(n) nodes and edges, and the string label of each edge is encoded by a pair of positions in T. Thus, even after the tree is built, the input text T needs to be kept stored and random access to T is still needed. The linear-size suffix tries (LSTs), proposed by Crochemore et al. [Linear-size suffix tries, TCS 638:171-178, 2016], are a "stand-alone" alternative to the suffix trees. Namely, the LST of a string T of length n occupies O(n) total space, and supports pattern matching and other tasks in the same efficiency as the suffix tree without the need to store the input text T. Crochemore et al. proposed an offline algorithm which transforms the suffix tree of T into the LST of T in O(n log sigma) time and O(n) space, where sigma is the alphabet size. In this paper, we present two types of online algorithms which "directly" construct the LST, from right to left, and from left to right, without constructing the suffix tree as an intermediate structure. Both algorithms construct the LST incrementally when a new symbol is read, and do not access to the previously read symbols. The right-to-left construction algorithm works in O(n log sigma) time and O(n) space and the left-to-right construction algorithm works in O(n (log sigma + log n / log log n)) time and O(n) space. The main feature of our algorithms is that the input text does not need to be stored

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

In-Place Bijective Burrows-Wheeler Transforms

Author: Hashimoto Daiki
Hendrian Diptarama
Shinohara Ayumi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Publication date: 01/01/2020
Field of study

One of the most well-known variants of the Burrows-Wheeler transform (BWT) [Burrows and Wheeler, 1994] is the bijective BWT (BBWT) [Gil and Scott, arXiv 2012], which applies the extended BWT (EBWT) [Mantaci et al., TCS 2007] to the multiset of Lyndon factors of a given text. Since the EBWT is invertible, the BBWT is a bijective transform in the sense that the inverse image of the EBWT restores this multiset of Lyndon factors such that the original text can be obtained by sorting these factors in non-increasing order. In this paper, we present algorithms constructing or inverting the BBWT in-place using quadratic time. We also present conversions from the BBWT to the BWT, or vice versa, either (a) in-place using quadratic time, or (b) in the run-length compressed setting using ?(n lg r / lg lg r) time with ?(r lg n) bits of words, where r is the sum of character runs in the BWT and the BBWT

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Inferring Strings from Position Heaps in Linear Time

Author: Hendrian Diptarama
Kumagai Koshiro
Shinohara Ayumi
Yoshinaka Ryo
Publication venue
Publication date: 12/12/2022
Field of study

Position heaps are index structures of text strings used for the string matching problem. They are rooted trees whose edges and nodes are labeled and numbered, respectively. This paper is concerned with variants of the inverse problem of position heap construction and gives linear-time algorithms for those problems. The basic problem is to restore a text string from a rooted tree with labeled edges and numbered nodes. In the variant problems, the input trees may miss edge labels or node numbers which we must restore as well.Comment: 10 pages, 5 figure

arXiv.org e-Print Archive